Image colorization algorithm based on foreground semantic information
WU Lidan, XUE Yuyang, TONG Tong, DU Min, GAO Qinquan
Journal of Computer Applications    2021, 41 (7): 2048-2053.   DOI: 10.11772/j.issn.1001-9081.2020081184
Abstract
An image can be divided into a foreground part and a background part, and the foreground is usually the visual center. Because the foreground spans many categories and complex scenes, it is difficult to colorize, so the foreground of an image often suffers from poor colorization and loss of detail. To solve these problems, an image colorization algorithm based on foreground semantic information was proposed to improve colorization quality and produce images with natural overall color and richly colored content. First, a foreground network was used to extract the low-level and high-level features of the foreground part. Then, these features were integrated into the foreground subnetwork to eliminate the influence of background color information and emphasize the foreground color information. Finally, the network was continuously optimized with a generation loss and a pixel-level color loss to guide the generation of high-quality images. Experimental results show that after introducing the foreground semantic information, the proposed algorithm improves Peak Signal-to-Noise Ratio (PSNR) and Learned Perceptual Image Patch Similarity (LPIPS), effectively alleviating the dull color, detail loss and low contrast that arise when colorizing the visually central regions; compared with other algorithms, it produces a more natural colorization of the overall image and a significant improvement on the content part.
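As a rough illustration of the fusion and the two-term objective described above, the following PyTorch sketch combines a main colorization branch with foreground-only features and optimizes a pixel-level color loss plus a generation (adversarial) loss; the module layout, channel width and loss weight are assumptions for illustration, not the authors' published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ForegroundFusionColorizer(nn.Module):
    """Main colorization branch fused with foreground-only features (a sketch)."""
    def __init__(self, ch=64):
        super().__init__()
        self.backbone = nn.Sequential(                   # main colorization branch
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.fg_branch = nn.Sequential(                  # foreground subnetwork
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True))
        self.head = nn.Conv2d(ch, 2, 3, padding=1)       # predict two chroma channels

    def forward(self, gray, fg_mask):
        feat = self.backbone(gray)
        fg = self.fg_branch(gray * fg_mask)              # background zeroed out
        return self.head(feat + fg)                      # emphasize foreground color cues

def total_loss(pred_ab, gt_ab, d_fake, w_gen=0.01):
    """Pixel-level color loss plus a generation (adversarial) loss; w_gen is assumed."""
    pixel = F.l1_loss(pred_ab, gt_ab)
    gen = F.binary_cross_entropy_with_logits(
        d_fake, torch.ones_like(d_fake))                 # push generator to fool D
    return pixel + w_gen * gen
```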
Bamboo strip surface defect detection method based on improved CenterNet
GAO Qinquan, HUANG Bingcheng, LIU Wenzhe, TONG Tong
Journal of Computer Applications    2021, 41 (7): 1933-1938.   DOI: 10.11772/j.issn.1001-9081.2020081167
Abstract
In bamboo strip surface defect detection, the defects vary widely in shape and the imaging environment is cluttered, so existing detection models based on Convolutional Neural Networks (CNN) cannot exploit their strengths on such specific data. Moreover, because the sources of bamboo strips are complicated and collection is constrained in other ways, it is impossible to gather every type of data, leaving too little bamboo strip defect data for a CNN to learn from fully. To address these problems, a detection network tailored to bamboo strip defects was proposed. The basic framework of the proposed network is CenterNet. To improve CenterNet's detection performance with limited defect data, an auxiliary detection module trained from scratch was designed: when training started, the part of CenterNet that uses a pre-trained model was frozen, and the auxiliary detection module was trained from scratch according to the defect characteristics of the bamboo strips; when the loss of the auxiliary detection module stabilized, the module was integrated with the pre-trained main part through an attention-based connection (see the sketch below). The proposed network was trained and tested on the same datasets as CenterNet and YOLO v3, which is currently common in industrial detection. Experimental results show that on the bamboo strip defect detection dataset, the mean Average Precision (mAP) of the proposed method is 16.45 and 9.96 percentage points higher than those of YOLO v3 and CenterNet, respectively. The proposed method can effectively detect bamboo strip defects of different shapes without adding much time consumption, and performs well in actual industrial applications.
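The two-phase schedule described above (freeze the pre-trained part, train the auxiliary module from scratch, then fuse through attention) can be sketched as follows in PyTorch; `AttentionFusion`, the gate design and the variable names are hypothetical stand-ins, not the paper's exact modules.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse pre-trained backbone features with scratch-trained auxiliary
    features through a channel-attention gate (an assumed design)."""
    def __init__(self, ch=64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),      # global context per channel
            nn.Conv2d(2 * ch, ch, 1),
            nn.Sigmoid())

    def forward(self, main_feat, aux_feat):
        attn = self.gate(torch.cat([main_feat, aux_feat], dim=1))
        return main_feat + attn * aux_feat   # attention-weighted auxiliary cues

def set_trainable(module: nn.Module, flag: bool):
    """Freeze or unfreeze a submodule for the two-phase schedule."""
    for p in module.parameters():
        p.requires_grad = flag

# Phase 1: freeze the pre-trained CenterNet part, train only the
# auxiliary module from scratch on the defect data, e.g.:
#   set_trainable(centernet_backbone, False); set_trainable(aux_module, True)
# Phase 2: once the auxiliary loss stabilizes, fuse through
# AttentionFusion and fine-tune jointly.
```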
Blurred video frame interpolation method based on deep voxel flow
LIN Chuanjian, DENG Wei, TONG Tong, GAO Qinquan
Journal of Computer Applications    2020, 40 (3): 819-824.   DOI: 10.11772/j.issn.1001-9081.2019081474
Abstract
Motion blur has an extremely negative effect on video frame interpolation. To handle this problem, a novel blurred video frame interpolation method was proposed. Firstly, a multi-task fusion convolutional neural network consisting of a deblurring module and a frame interpolation module was proposed. In the deblurring module, built on a deep Convolutional Neural Network (CNN) with stacked ResBlocks, motion blur was removed from the two input frames by extracting and learning deep blur features. The frame interpolation module then estimated the voxel flow between the two deblurred consecutive frames, and the obtained voxel flow guided a trilinear interpolation of the pixels to synthesize the intermediate frame. Secondly, a large simulated blurred-video dataset was built, and a “first separate, then combine”, coarse-to-fine training strategy was proposed; experimental results show that this strategy promotes effective convergence of the multi-task fusion network. Finally, compared with a simple combination of state-of-the-art deblurring and frame interpolation algorithms, experimental metrics show that the intermediate frames synthesized by the proposed method have the Peak Signal-to-Noise Ratio (PSNR) increased by at least 1.41 dB, the structural similarity improved by at least 0.020, and the interpolation error decreased by at least 1.99. Visual comparison and reconstructed sequences show that the proposed model achieves good frame-rate up-conversion for blurred videos; in other words, it can reconstruct two consecutive blurred frames end-to-end into three sharp and visually smooth frames.
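The trilinear synthesis step can be sketched as below: given a voxel flow (two spatial displacement components plus a temporal blend weight) predicted between the two deblurred frames, the intermediate frame is obtained by bilinear warping of both frames followed by a temporal blend. This is a generic deep-voxel-flow formulation under the flow conventions stated in the docstring, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def synthesize_middle_frame(frame0, frame1, flow, mask):
    """Trilinear voxel-flow synthesis of the intermediate frame.

    frame0, frame1: (N, 3, H, W) deblurred consecutive frames.
    flow: (N, 2, H, W) pixel displacements; mask: (N, 1, H, W) in [0, 1],
    the temporal blend weight (the third, "voxel" axis of the flow).
    """
    n, _, h, w = frame0.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h, device=frame0.device),
                            torch.linspace(-1, 1, w, device=frame0.device),
                            indexing="ij")
    base = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(n, h, w, 2)
    # Convert pixel displacements to grid_sample's normalized coordinates.
    norm = torch.stack([flow[:, 0] / ((w - 1) / 2.0),
                        flow[:, 1] / ((h - 1) / 2.0)], dim=-1)
    warped0 = F.grid_sample(frame0, base - 0.5 * norm, align_corners=True)
    warped1 = F.grid_sample(frame1, base + 0.5 * norm, align_corners=True)
    return mask * warped0 + (1.0 - mask) * warped1   # temporal blend
```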
Video compression artifact removal algorithm based on adaptive separable convolution network
NIE Kehui, LIU Wenzhe, TONG Tong, DU Min, GAO Qinquan
Journal of Computer Applications    2019, 39 (5): 1473-1479.   DOI: 10.11772/j.issn.1001-9081.2018081801
Abstract
The optical flow estimation methods frequently used in video quality enhancement and super-resolution reconstruction tasks can only estimate linear motion between pixels. To solve this problem, a new multi-frame compression artifact removal network architecture was proposed. The network consisted of a motion compensation module and a compression artifact removal module. By replacing traditional optical flow estimation with adaptive separable convolution, the motion compensation module was able to handle the curvilinear motion between pixels, which optical flow methods cannot model well. For each video frame, the motion compensation module generated a corresponding convolutional kernel based on the image structure and the local displacement of pixels; motion offsets were then estimated and pixels in the next frame were compensated by local convolution. The compensated frame and the original next frame were combined as input to the compression artifact removal module, which removed the compression artifacts of the original frame by fusing the pixel information of the two frames. Compared with the state-of-the-art Multi-Frame Quality Enhancement (MFQE) algorithm on the same training and testing datasets, the proposed network improves the Peak Signal-to-Noise Ratio gain (ΔPSNR) by up to 0.44 dB and by 0.32 dB on average. The experimental results demonstrate that the proposed network performs well in removing video compression artifacts.
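The per-pixel separable local convolution at the heart of the motion compensation module can be sketched as follows: the network predicts a vertical and a horizontal 1D kernel for each pixel, their outer product forms a 2D kernel, and the compensated pixel is the local convolution of the next frame with that kernel. The kernel size and tensor layout here are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def separable_local_conv(next_frame, k_v, k_h):
    """Compensate pixels in the next frame with per-pixel separable kernels.

    next_frame: (N, C, H, W); k_v, k_h: (N, K, H, W) vertical and horizontal
    1D kernels predicted by the compensation network (K assumed odd).
    """
    n, c, h, w = next_frame.shape
    k = k_v.shape[1]
    patches = F.unfold(next_frame, kernel_size=k, padding=k // 2)  # (N, C*K*K, H*W)
    patches = patches.view(n, c, k, k, h, w)
    # Outer product of the two 1D kernels yields a 2D kernel per pixel.
    kernel = k_v.view(n, 1, k, 1, h, w) * k_h.view(n, 1, 1, k, h, w)
    return (patches * kernel).sum(dim=(2, 3))  # (N, C, H, W) compensated frame
```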
Compression method of super-resolution convolutional neural network based on knowledge distillation
GAO Qinquan, ZHAO Yan, LI Gen, TONG Tong
Journal of Computer Applications    2019, 39 (10): 2802-2808.   DOI: 10.11772/j.issn.1001-9081.2019030516
Abstract
Current deep-learning network models for super-resolution image reconstruction have deep structures and high computational complexity, and their large storage requirements prevent them from operating effectively on resource-constrained devices. To address these problems, a compression method for super-resolution convolutional neural networks based on knowledge distillation was proposed. The method uses a teacher network with many parameters and good reconstruction quality together with a student network with few parameters and poor reconstruction quality. Firstly, the teacher network was trained; then knowledge distillation was used to transfer knowledge from the teacher network to the student network; finally, the reconstruction quality of the student network was improved without changing its structure or its number of parameters. Peak Signal-to-Noise Ratio (PSNR) was used to evaluate reconstruction quality in the experiments. Compared with the student network trained without knowledge distillation, the student network trained with knowledge distillation has its PSNR increased by 0.53 dB, 0.37 dB, 0.24 dB and 0.45 dB respectively on four public test sets at a magnification factor of 3. Without changing the structure of the student network, the proposed method significantly improves its super-resolution reconstruction quality.
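A minimal sketch of the distillation objective, assuming output-level distillation with an L1 loss and an assumed weighting `alpha` (the paper's exact loss terms and weights may differ):

```python
import torch
import torch.nn.functional as F

def distillation_loss(teacher, student, lr_img, hr_img, alpha=0.5):
    """Student matches both the ground truth and the frozen teacher's output.

    teacher: pre-trained, frozen super-resolution network;
    student: small network being trained; alpha is an assumed weight.
    """
    with torch.no_grad():
        t_sr = teacher(lr_img)            # teacher is already trained and frozen
    s_sr = student(lr_img)
    return ((1.0 - alpha) * F.l1_loss(s_sr, hr_img)   # supervised reconstruction term
            + alpha * F.l1_loss(s_sr, t_sr))          # knowledge transfer term
```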
Design of augmented reality navigation simulation system for pelvic minimally invasive surgery based on stereoscopic vision
GAO Qinquan, HUANG Weiping, DU Min, WEI Mengyu, KE Dongzhong
Journal of Computer Applications    2018, 38 (9): 2660-2665.   DOI: 10.11772/j.issn.1001-9081.2018020335
Abstract
Minimally invasive endoscopic surgery remains challenging due to the complexity of the anatomical location and the limitations of endoscopic vision. An Augmented Reality (AR) navigation system was designed for simulation of pelvic minimally invasive surgery. Firstly, a 3D pelvic model, segmented and reconstructed from preoperative Computed Tomography (CT), was texture-mapped with real pelvic surgical video to simulate a surgical video with ground-truth camera poses. The blank model was initially registered with the intraoperative video by a 2D/3D registration based on the color consistency of visible surface points. After that, the intraoperative endoscope was accurately tracked with a stereoscopic tracking algorithm. Using the multi-Degree-Of-Freedom (DOF) transformation matrix of the endoscope, the preoperative 3D model could then be fused with the intraoperative view to achieve AR navigation. The experimental results show that the root mean square error of the estimated trajectory compared to the ground truth is 2.3933 mm, which indicates that the system can provide a good AR display for visual navigation.
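Once the endoscope pose is tracked, the overlay step amounts to projecting the preoperative model through the estimated transformation and the endoscope intrinsics. A minimal NumPy sketch, assuming a 4x4 camera-from-model pose and a 3x3 intrinsic matrix (names and conventions are illustrative):

```python
import numpy as np

def project_model(vertices, pose, K):
    """Project preoperative 3D model vertices into the endoscopic image.

    vertices: (M, 3) model points; pose: (4, 4) camera-from-model transform
    from the stereoscopic tracker; K: (3, 3) endoscope intrinsics.
    Returns (M, 2) pixel coordinates for the AR overlay.
    """
    homo = np.hstack([vertices, np.ones((len(vertices), 1))])  # homogeneous coords
    cam = (pose @ homo.T).T[:, :3]                             # into the camera frame
    uv = (K @ cam.T).T                                         # perspective projection
    return uv[:, :2] / uv[:, 2:3]                              # divide by depth
```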
Convolutional neural network based method for diagnosis of Alzheimer's disease
LIN Weiming, GAO Qinquan, DU Min
Journal of Computer Applications    2017, 37 (12): 3504-3508.   DOI: 10.11772/j.issn.1001-9081.2017.12.3504
Abstract
Alzheimer's Disease (AD) usually leads to atrophy of the hippocampus region. Based on this characteristic, a Convolutional Neural Network (CNN) based method was proposed for the diagnosis of AD using the hippocampus region in brain Magnetic Resonance Imaging (MRI). All data were obtained from the ADNI database, comprising 188 AD subjects and 229 Normal Controls (NC). Firstly, all brain MRI scans were preprocessed by skull stripping and aligned to a template space. Secondly, a linear regression model was used to correct for age-related brain atrophy. Then, multiple 2.5D images were extracted from the hippocampus region of the preprocessed 3D brain image of each subject. Finally, a CNN was trained to recognize the extracted 2.5D images, and the recognition results of the same subject were combined for the joint diagnosis of AD. The experiments were carried out using multiple runs of ten-fold cross-validation. The experimental results show that the average recognition accuracy of the proposed method reaches 88.02%. Comparison results show that, relative to a Stacked Auto-Encoder (SAE) method, the proposed method improves the diagnosis of AD when only MRI is used.
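One common way to build such 2.5D inputs is to stack the three orthogonal slices passing through a voxel in the hippocampus region as channels, then fuse the per-patch CNN outputs for each subject. A sketch under that assumption (the paper's exact patch construction and fusion rule may differ, and the plane names depend on the volume's axis ordering):

```python
import numpy as np

def extract_25d_patch(volume, center, size=32):
    """Stack the three orthogonal slices through `center` as channels.

    volume: preprocessed, template-aligned 3D MRI array;
    center: (x, y, z) voxel inside the hippocampus region.
    Returns a (3, size, size) patch usable as a 3-channel CNN input.
    """
    x, y, z = center
    r = size // 2
    plane_xy = volume[x - r:x + r, y - r:y + r, z]
    plane_xz = volume[x - r:x + r, y, z - r:z + r]
    plane_yz = volume[x, y - r:y + r, z - r:z + r]
    return np.stack([plane_xy, plane_xz, plane_yz], axis=0)

def joint_diagnosis(patch_probs):
    """Fuse the CNN outputs of all patches of one subject by averaging."""
    return int(np.mean(patch_probs) > 0.5)   # 1 = AD, 0 = NC
```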